NetNews Offline 2

Internet Message Format | 1996-08-05 | 6.0 KB

Path: ifi.uio.no!usenet
From: ludvigp@ifi.uio.no (Ludvig Pedersen)
Newsgroups: comp.sys.amiga.programmer
Subject: Re: doubling pixels horizontally
Date: 6 Mar 1996 18:52:33 GMT
Organization: Dept. of Informatics, University of Oslo, Norway
Message-ID: <5257.6639T1152T2935@ifi.uio.no>
References: <4f4ibc$gl9@news.cs.tu-berlin.de>
 <591.6610T1165T2102@login.eunet.no>
 <1045.6611T753T2256@vip.cybercity.dk>
 <4faoe1$47@sunsystem5.informatik.tu-muenchen.de>
 <2991.6612T1034T625@vip.cybercity.dk>
 <576.6613T1070T1730@login.eunet.no>
 <1257.6614T57T922@vip.cybercity.dk>
 <1982.6617T1096T103@ifi.uio.no>
 <4gbjg3$104@sunsystem5.informatik.tu-muenchen.de>
 <4518.6625T1142T92@ifi.uio.no>
 <4h4hv5$mnn@sunsystem5.informatik.tu-muenchen.de>
 <2444.6635T982T1557@ifi.uio.no>
 <4hhjlv$5qb@sunsystem5.informatik.tu-muenchen.de>
NNTP-Posting-Host: gymir.ifi.uio.no
X-Newsreader: THOR 2.22 (Amiga;TCP/IP)

>|> >Maybe I'm the only one but I can read/build code much better that way.
>|> >And I could read my code well :)
>|> WOW!...do you have any super-natural powers? ;^)
>grrrrrrrrrrrrrrr :) no that's just a subjective thing.

hehehe....sorry about that! (grin) I just couldn't resist. :-)

>Code gets more structured (grrr don't laugh ;) and though more overview.
>Puting instructions next to each other which are related to each other
>subq.w #1,a5 : cmp.w #0,a5 : bne loop ;out of data registers
>just one structure level more, code gets 2 dimensional (blah:) like
>in C code. And as asm needs more instructions than C, it needs the
>2 dimensional format even more if you don't wanna lose overwiev. baeh!
>:)

Asm was never designed for it, and I don't think it looks good either.

>|> >|> On my A1200 7mb/sec is not copy speed but chip write speed.
>|> >mhm, all people told me the blizzard will _copy_ 7mb/sec.
>|> >a myth ?
>|> I think so. But please show me the copy-loop and I'll test it.
>could you please try movem.l (fast)+,d0-d7 and then 8 times move.l dn,(chip)+
>?
I tried a LOT of different loops, and here is a small collection of the
top 5. Actually the result was a little better than I thought.
ALL DMA IS OFF!

    ; Speed: 5.640 MB/s
    move.l  (a0)+,d0
    move.l  (a0)+,d1
    move.l  (a0)+,d2
    move.l  (a0)+,d3
    move.l  (a0)+,d4
    move.l  (a0)+,d5
    move.l  (a0)+,d6
    move.l  (a0)+,a2
    move.l  (a0)+,a3
    move.l  (a0)+,a4
    move.l  (a0)+,a5
    move.l  (a0)+,a6
    move.l  d0,(a1)+
    move.l  d1,(a1)+
    move.l  d2,(a1)+
    move.l  d3,(a1)+
    move.l  d4,(a1)+
    move.l  d5,(a1)+
    move.l  d6,(a1)+
    move.l  a2,(a1)+
    move.l  a3,(a1)+
    move.l  a4,(a1)+
    move.l  a5,(a1)+
    move.l  a6,(a1)+

    ; Speed: 5.472 MB/s
    movem.l (a0)+,d0-d6/a2-a6
    move.l  d0,(a1)+
    move.l  d1,(a1)+
    move.l  d2,(a1)+
    move.l  d3,(a1)+
    move.l  d4,(a1)+
    move.l  d5,(a1)+
    move.l  d6,(a1)+
    move.l  a2,(a1)+
    move.l  a3,(a1)+
    move.l  a4,(a1)+
    move.l  a5,(a1)+
    move.l  a6,(a1)+

    ; Speed: 5.472 MB/s
    movem.l (a0)+,d0-d6/a2-a6
    movem.l d0-d6/a2-a6,-(a1)

    ; Speed: 4.896 MB/s
    rept 16
    move.l  (a0)+,(a1)+
    endr
    dbra    d7,.loop

    ; Speed: 4.656 MB/s
    rept 16
    move.l  (a0)+,d0
    move.l  d0,(a1)+
    endr

>imho this should do 7mb/sec in the store part. if the movem
>is very fast, you aproximate the 7mb/sec also doing copying.

7 MB/s is not possible. Remember that you have to access the same
data bus to read from FastRam.

>On 020-14 it will be slower than normal copy, on 020-28 maybe already
>faster (only theory!)
>so we still need a test if it's faster than move.l (fast)+,(chip)+
>|> Here is my results from bustest:
>|>
>|> BusSpeedTest 0.07 (mlelstv) Buffer: 16384 Bytes
>|> ==================================================
>|> loop overhead: 4.5ns
>|> register move: 40.6ns
>huh ? a register move is 2 cycles. you got 24.63 MHz ?

Ehh..No, I have 50 MHz. I tested it myself (just to be sure).
I was able to do 24400000 register moves and 203300 dbras per second.
A dbra is 3 times slower than a register move, so that's
24400000 + 3 * 203300 = 25009900 register-move equivalents per second,
i.e. 25.0 peak MIPS.

1.000.000.000 ns / 25.009.900 = 39.98 ns

Check your numbers, it's correct!
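The register-move arithmetic above can be re-run with a few lines of Python. This is only a sketch for checking the numbers: the function name is made up for illustration, and the cost of one dbra is taken as exactly 3 register moves, as stated in the post.

```python
# Sanity check of the register-move timing quoted in the post.
# Assumption (from the post): one dbra costs as much as 3 register moves.
# The function name is hypothetical, chosen only for this example.

def ns_per_register_move(moves_per_sec, dbras_per_sec, dbra_cost=3):
    """Return the time taken by one register move, in nanoseconds."""
    equivalent_moves = moves_per_sec + dbra_cost * dbras_per_sec
    return 1_000_000_000 / equivalent_moves


if __name__ == "__main__":
    ns = ns_per_register_move(24_400_000, 203_300)
    print(f"{ns:.2f} ns per register move")
    # One clock at 50 MHz is 20 ns, so this shows the cost in clocks.
    print(f"{ns / 20.0:.2f} clocks per move at 50 MHz")
```

The result is close to the 40.6 ns that bustest reports for a register move, which fits a 2-cycle move on a 50 MHz CPU (20 ns per clock).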
>|> memtype  op     cycle    bandwidth
>|> fast     readw  109.1ns  18.3MByte/s
>|> fast     readl  137.6ns  29.1MByte/s
>|> fast     readm  167.7ns  23.8MByte/s
>readm slower ? hmhmhmhm. nooo.

Ohhh-yes.. Just look at the copy results.

>if you use enough regs it's faster on 020-14.
>also reading from chip is faster with readm... mhmhm
>|> Please not that this is write-speed and NOT copy speed.
>|> >|> I did a simple test and I was able to copy 4.9mb/sec from fastram to
>|> >|> chipram on a 256 colors screen.
>|> >what size ? overscan, 320x256,320x200 ? pal ?
>|> PAL-lowres, no overscan.
>maybe you can write in 10 14-mhz cycle parts, i.e 5.672mb/sec theoretic
>if no dma at all.

Yes, you can write 5.6 MB/s to ChipRam with no DMA.

>|> >|> I don't get it??? ;) It is optimal. That was actually VERY easy.
>|> >|> Remeber that we are taking about 2x2 without sprite-dithering!!
>|> >either you misuse a plane as mask, so 128 colors only, or the 2x2
>|> >routine is slower than 3pass, i.e. not optimal :)
>|> The blitter uses only 2 passes and *is* optimal. And its 256 colors.
>hehe, 2 passes blitter is slower than 1 pass blitter ;)
>there's a misunderstanding in the word "optimal".
>while there realy might be no way to do faster the way you do,
>there's another way to do it :) the other way has disadvantages
>(monitor sideefects ;) though.

We support both 2xN sprite-dithering (1 pass) and normal 2xN (2 pass).
If your render routine runs at 25 fps or slower, using the 2 pass version
doesn't matter at all for speed and framerate. You only get a
better-looking display.

Can you explain the monitor side effects you are talking about?
Is this something new?

Ludde - Amiga Demo Coder
Virtual Reality & Official Be developer
ludvigp@ifi.uio.no
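The bandwidth figures quoted in this post (the bustest table and the 5.672 MB/s chip-write estimate) can be cross-checked with a short Python sketch. Everything here is an assumption made for the check, not something the post states outright: 1 MB is taken as 1,000,000 bytes, readw is taken to move 2 bytes and readl/readm 4 bytes per measured cycle, the "14-mhz" clock is taken as 14.18 MHz, and one chip write is taken to be a 4-byte longword every 10 of those cycles. The function names are made up for illustration.

```python
# Cross-check of the bandwidth numbers quoted in the post.
# All constants below are assumptions for this check (see the text above).

def bandwidth_mb_per_s(bytes_per_access, cycle_ns):
    """Bandwidth in MB/s (1 MB = 1,000,000 bytes) for one access per cycle."""
    return bytes_per_access / cycle_ns * 1000.0  # bytes/ns -> MB/s


def chip_write_mb_per_s(clock_mhz=14.18, cycles_per_write=10, bytes_per_write=4):
    """Theoretical chip RAM write speed with no DMA, in MB/s."""
    return clock_mhz / cycles_per_write * bytes_per_write


if __name__ == "__main__":
    for op, size, cycle in (("readw", 2, 109.1),
                            ("readl", 4, 137.6),
                            ("readm", 4, 167.7)):
        print(f"fast {op}: {bandwidth_mb_per_s(size, cycle):.2f} MB/s")
    print(f"chip write: {chip_write_mb_per_s():.3f} MB/s")
```

Under these assumptions the computed values land within 0.1 MB/s of the table and reproduce the 5.672 MB/s estimate, which suggests the table's bandwidth column is simply bytes per access divided by the measured cycle time.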